Smart Paper Notes

Zero-Shot Motor Health Monitoring by Blind Domain Transition

Serkan Kiranyaz , Ozer Can Devecioglu , Amir Alhams , Sadok Sassi , Turker Ince , Osama Abdeljaber , Onur Avci , Moncef Gabbouj

Categories: Machine Learning | Artificial Intelligence

2022-12-12

Continuous long-term monitoring of motor health is crucial for the early detection of abnormalities such as bearing faults (up to 51% of motor failures are attributed to bearing faults). Despite numerous methodologies proposed for bearing fault detection, most of them require normal (healthy) and abnormal (faulty) data for training. Even with the recent deep learning (DL) methodologies trained on the labeled data from the same machine, the classification accuracy significantly deteriorates when one or a few conditions are altered. Furthermore, their performance degrades significantly, or they may fail entirely, when they are tested on another machine with entirely different healthy and faulty signal patterns. To address this need, in this pilot study, we propose a zero-shot bearing fault detection method that can detect any fault on a new (target) machine regardless of the working conditions, sensor parameters, or fault characteristics. To accomplish this objective, a 1D Operational Generative Adversarial Network (Op-GAN) first characterizes the transition between normal and fault vibration signals of (a) source machine(s) under various conditions, sensor parameters, and fault types. Then, for a target machine, the potential faulty signals can be generated, and a compact and lightweight 1D Self-ONN fault detector can be trained over its actual healthy and synthesized faulty signals to detect the real faulty condition in real time whenever it occurs. To validate the proposed approach, a new benchmark dataset is created using two different motors working under different conditions and sensor locations. Experimental results demonstrate that this novel approach can accurately detect any bearing fault, achieving average recall rates of around 89% and 95% on the two target machines, regardless of the fault's type, severity, and location.

Zero-Shot Transfer Learning for Structural Health Monitoring using Generative Adversarial Networks and Spectral Mapping

Mohammad Hesam Soleimani-Babakamali , Roksana Soleimani-Babakamali , Kourosh Nasrollahzadeh , Onur Avci , Serkan Kiranyaz , Ertugrul Taciroglu

Categories: Machine Learning

2022-12-07

Gathering properly labelled, adequately rich, and case-specific data for successfully training a data-driven or hybrid model for structural health monitoring (SHM) applications is a challenging task. We posit that a Transfer Learning (TL) method that utilizes available data in any relevant source domain and directly applies to the target domain through domain adaptation can provide substantial remedies to address this issue. Accordingly, we present a novel TL method that differentiates between the source's no-damage and damage cases and utilizes a domain adaptation (DA) technique. The DA module transfers the accumulated knowledge in contrasting no-damage and damage cases in the source domain to the target domain, given only the target's no-damage case. High-dimensional features allow employing signal processing domain knowledge to devise a generalizable DA approach. The Generative Adversarial Network (GAN) architecture is adopted for learning since its optimization process accommodates high-dimensional inputs in a zero-shot setting. At the same time, its training objective conforms seamlessly with the case of no-damage and damage data in SHM since its discriminator network differentiates between real (no damage) and fake (possibly unseen damage) data. An extensive set of experimental results demonstrates the method's success in transferring knowledge on differences between no-damage and damage cases across three strongly heterogeneous independent target structures. The area under the Receiver Operating Characteristics curves (Area Under the Curve - AUC) is used to evaluate the differentiation between no-damage and damage cases in the target domain, reaching values as high as 0.95. With no-damage and damage cases discerned from each other, zero-shot structural damage detection is carried out. The mean F1 scores for all damages in the three independent datasets are 0.978, 0.992, and 0.975.
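
The two evaluation metrics named in this abstract, AUC and F1, can be illustrated with a minimal sketch. The labels and scores below are made-up toy values (1 = damage, 0 = no damage), not data from the paper, and the 0.5 decision threshold is an assumption chosen only for the example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random damage sample (label 1)
    receives a higher score than a random no-damage sample (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """F1 = 2*TP / (2*TP + FP + FN) for the damage class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy values for illustration only (not results from the paper).
labels = [0, 0, 0, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3, 0.9]
auc = roc_auc(labels, scores)
# Hypothetical threshold of 0.5 to turn scores into hard decisions.
f1 = f1_score(labels, [1 if s >= 0.5 else 0 for s in scores])
```

AUC is threshold-free, which is why it suits the first stage (how well no-damage and damage cases separate), while F1 needs hard decisions and so fits the final detection step.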

Generative Adversarial Networks for Labeled Data Creation for Structural Damage Detection

Furkan Luleci , F. Necati Catbas , Onur Avci

Categories: Machine Learning | Machine Learning (Statistics)

2021-12-07

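Over the past few decades, the field of data science has advanced rapidly, and other disciplines have continuously benefited from it. Structural Health Monitoring (SHM) is one of the fields that use Artificial Intelligence (AI), such as Machine Learning (ML) and Deep Learning (DL) algorithms, for the condition assessment of civil structures based on collected data. ML and DL methods require plenty of data for their training procedures; however, in SHM, collecting data from civil structures is very exhaustive, and obtaining useful data (damage-associated data) in particular can be very challenging. This paper uses a 1-D Wasserstein Deep Convolutional Generative Adversarial Network with Gradient Penalty (1-D WDCGAN-GP) to generate synthetic labeled vibration data. Structural damage detection is then carried out on vibration datasets synthetically augmented at different levels, using a 1-D Deep Convolutional Neural Network (1-D DCNN). The damage detection results show that the 1-D WDCGAN-GP can be successfully utilized to tackle data scarcity in vibration-based damage diagnostics of civil structures. Keywords: Structural Health Monitoring (SHM), Structural Damage Diagnostics, Structural Damage Detection, 1-D Deep Convolutional Neural Networks (1-D DCNN), 1-D Generative Adversarial Networks (1-D GAN), Deep Convolutional Generative Adversarial Networks (DCGAN), Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP)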
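
For reference, the gradient-penalty critic objective that the "GP" in 1-D WDCGAN-GP refers to is commonly written as below. This is the standard WGAN-GP formulation, given as an assumption about the general technique; the abstract does not state the paper's exact loss or the value of the penalty weight $\lambda$. Here $D$ is the critic, $P_r$ the real data distribution, $P_g$ the generator distribution, and $\hat{x}$ a random interpolate between a real and a generated sample.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x},
\qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The penalty term pushes the critic's gradient norm toward 1 along these interpolates, which is how the Lipschitz constraint of the Wasserstein formulation is enforced without weight clipping.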

TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering

Meryem Banu Cavlak , Gagandeep Singh , Mohammed Alser , Can Firtina , Joël Lindegger , Mohammad Sadrosadati , Nika Mansouri Ghiasi , Can Alkan , Onur Mutlu

Categories: Artificial Intelligence | Machine Learning

2022-12-09

Basecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and thus are discarded in later steps in the genomics pipeline, wasting the basecalling computation. To overcome this issue, we propose TargetCall, the first fast and widely-applicable pre-basecalling filter to eliminate the wasted computation in basecalling. TargetCall's key idea is to discard reads that will not match the target reference (i.e., off-target reads) prior to basecalling. TargetCall consists of two main components: (1) LightCall, a lightweight neural network basecaller that produces noisy reads; and (2) Similarity Check, which labels each of these noisy reads as on-target or off-target by matching them to the target reference. TargetCall filters out all off-target reads before basecalling, and the highly-accurate but slow basecalling is performed only on the raw signals whose noisy reads are labeled as on-target. Our thorough experimental evaluations using both real and simulated data show that TargetCall 1) improves the end-to-end basecalling performance of the state-of-the-art basecaller by 3.31x while maintaining high (98.88%) sensitivity in keeping on-target reads, 2) maintains high accuracy in downstream analysis, 3) precisely filters out up to 94.71% of off-target reads, and 4) achieves better performance, sensitivity, and generality compared to prior works. We freely open-source TargetCall at https://github.com/CMU-SAFARI/TargetCall.
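
The two-stage filter described above (cheap noisy basecall, then a similarity check against the target reference) can be sketched as follows. This is a toy illustration only, not the TargetCall code: the abstract does not say how reads are matched to the reference, so the k-mer overlap test, the names `similarity_check` and `filter_signals`, the parameters `k` and `min_fraction`, and the stand-in `light_basecall` function are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity_check(noisy_read, reference_kmers, k=5, min_fraction=0.5):
    """Label a noisy read on-target (True) if enough of its k-mers occur
    in the target reference. Hypothetical stand-in for the real matcher."""
    read_kmers = kmers(noisy_read, k)
    if not read_kmers:
        return False
    shared = len(read_kmers & reference_kmers)
    return shared / len(read_kmers) >= min_fraction


def filter_signals(raw_signals, light_basecall, reference, k=5, min_fraction=0.5):
    """Keep only the raw signals whose noisy read looks on-target, so the
    slow, accurate basecaller runs on these alone."""
    reference_kmers = kmers(reference, k)
    return [sig for sig in raw_signals
            if similarity_check(light_basecall(sig), reference_kmers, k, min_fraction)]


# Toy data: each "signal" is an (id, noisy read) pair, and the stand-in
# light basecaller simply returns the stored noisy read.
reference = "ACGTACGTTAGCCGATAGGCTA"
signals = [
    ("read1", "ACGTACGTTAGCC"),   # exact substring of the reference
    ("read2", "TTTTTTTTTTTTT"),   # unrelated sequence
    ("read3", "CGATAGGATA"),      # reference fragment with one base error
]
kept = filter_signals(signals, lambda sig: sig[1], reference)
kept_ids = [sig[0] for sig in kept]
```

The tolerance in `min_fraction` reflects the point made in the abstract: the first-pass reads are noisy, so the check has to accept imperfect matches to keep sensitivity on on-target reads high.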

PALMER: Perception-Action Loop with Memory for Long-Horizon Planning

Onur Beker , Mohammad Mohammadi , Amir Zamir

Categories: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2022-12-08

To achieve autonomy in a priori unknown real-world scenarios, agents should be able to: i) act from high-dimensional sensory observations (e.g., images), ii) learn from past experience to adapt and improve, and iii) be capable of long horizon planning. Classical planning algorithms (e.g. PRM, RRT) are proficient at handling long-horizon planning. Deep learning based methods in turn can provide the necessary representations to address the others, by modeling statistical contingencies between observations. In this direction, we introduce a general-purpose planning algorithm called PALMER that combines classical sampling-based planning algorithms with learning-based perceptual representations. For training these perceptual representations, we combine Q-learning with contrastive representation learning to create a latent space where the distance between the embeddings of two states captures how easily an optimal policy can traverse between them. For planning with these perceptual representations, we re-purpose classical sampling-based planning algorithms to retrieve previously observed trajectory segments from a replay buffer and restitch them into approximately optimal paths that connect any given pair of start and goal states. This creates a tight feedback loop between representation learning, memory, reinforcement learning, and sampling-based planning. The end result is an experiential framework for long-horizon planning that is significantly more robust and sample efficient compared to existing methods.

NEON: Enabling Efficient Support for Nonlinear Operations in Resistive RAM-based Neural Network Accelerators

Aditya Manglik , Minesh Patel , Haiyu Mao , Behzad Salami , Jisung Park , Lois Orosa , Onur Mutlu

Categories: Artificial Intelligence | Machine Learning | Neural and Evolutionary Computing

2022-11-10

Resistive Random-Access Memory (RRAM) is well-suited to accelerate neural network (NN) workloads as RRAM-based Processing-in-Memory (PIM) architectures natively support highly-parallel multiply-accumulate (MAC) operations that form the backbone of most NN workloads. Unfortunately, NN workloads such as transformers require support for non-MAC operations (e.g., softmax) that RRAM cannot provide natively. Consequently, state-of-the-art works either integrate additional digital logic circuits to support the non-MAC operations or offload the non-MAC operations to CPU/GPU, resulting in significant performance and energy efficiency overheads due to data movement. In this work, we propose NEON, a novel compiler optimization to enable the end-to-end execution of the NN workload in RRAM. The key idea of NEON is to transform each non-MAC operation into a lightweight yet highly-accurate neural network. Utilizing neural networks to approximate the non-MAC operations provides two advantages: 1) We can exploit the key strength of RRAM, i.e., highly-parallel MAC operation, to flexibly and efficiently execute non-MAC operations in memory. 2) We can simplify RRAM's microarchitecture by eliminating the additional digital logic circuits while reducing the data movement overheads. Acceleration of the non-MAC operations in memory enables NEON to achieve a 2.28x speedup compared to an idealized digital logic-based RRAM. We analyze the trade-offs associated with the transformation and demonstrate feasible use cases for NEON across different substrates.

Hyperbolic Centroid Calculations for Text Classification

Aydın Gerek , Cüneyt Ferahlar , Bilge Şipal Sert , Mehmet Can Yüney , Onur Taşdemir , Zeynep Billur Kalafat , Mert Kelkit , Murat Can Ganiz

Categories: Natural Language Processing

2022-11-08

A new development in NLP is the construction of hyperbolic word embeddings. As opposed to their Euclidean counterparts, hyperbolic embeddings are represented not by vectors, but by points in hyperbolic space. This makes the most common basic scheme for constructing document representations, namely the averaging of word vectors, meaningless in the hyperbolic setting. We reinterpret the vector mean as the centroid of the points represented by the vectors, and investigate various hyperbolic centroid schemes and their effectiveness at text classification.
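
To make the centroid idea concrete, here is a minimal sketch of one possible hyperbolic centroid scheme. The abstract does not say which schemes the authors compare, so this choice is an assumption: points in the Poincaré ball are lifted to the hyperboloid model, summed, and renormalized back onto the hyperboloid. The function names are illustrative.

```python
import math


def ball_to_hyperboloid(p):
    """Lift a point from the Poincare ball to the hyperboloid model."""
    sq = sum(c * c for c in p)
    if sq >= 1.0:
        raise ValueError("point must lie strictly inside the unit ball")
    return [(1.0 + sq) / (1.0 - sq)] + [2.0 * c / (1.0 - sq) for c in p]


def hyperboloid_to_ball(x):
    """Project a hyperboloid point back to the Poincare ball."""
    return [c / (1.0 + x[0]) for c in x[1:]]


def minkowski_dot(x, y):
    """Minkowski inner product with signature (-, +, ..., +)."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lorentz_centroid(points):
    """Centroid of Poincare-ball points: sum the lifted points, then
    rescale the sum so it lies on the hyperboloid again."""
    if not points:
        raise ValueError("need at least one point")
    lifted = [ball_to_hyperboloid(p) for p in points]
    total = [sum(coords) for coords in zip(*lifted)]
    scale = math.sqrt(-minkowski_dot(total, total))
    return hyperboloid_to_ball([c / scale for c in total])


# Hypothetical 2-D "word embeddings" for illustration.
symmetric = lorentz_centroid([[0.5, 0.0], [-0.5, 0.0]])
skewed = lorentz_centroid([[0.0, 0.0], [0.9, 0.0]])
```

For the pair `[0, 0]` and `[0.9, 0]`, the Euclidean average would be `[0.45, 0]`, while this centroid lands further out, at the geodesic midpoint of the two points. That gap is the reason plain vector averaging is not a meaningful document representation in hyperbolic space.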

A Comparative Analysis of the Face Recognition Methods in Video Surveillance Scenarios

Eker Onur , Bal Murat

Categories: Computer Vision | Artificial Intelligence

2022-11-05

Facial recognition is fundamental for a wide variety of security systems operating in real-time applications. In video surveillance based face recognition, face images are typically captured over multiple frames in uncontrolled conditions, where head pose, illumination, shadowing, motion blur and focus change over the sequence. Facial recognition tasks generally involve three fundamental operations: face detection, face alignment, and face recognition. This study presents comparative benchmark tables for state-of-the-art face recognition methods by testing them with the same backbone architecture in order to focus only on the face recognition solution instead of the network architecture. For this purpose, we constructed a video surveillance dataset of face IDs with high age variance and intra-class variance (face make-up, beard, etc.), using native surveillance facial imagery data for evaluation. This work also identifies the best recognition methods for different conditions such as non-masked faces, masked faces, and faces with glasses.

E-VFIA: Event-Based Video Frame Interpolation with Attention

Onur Selim Kılıç , Ahmet Akman , A. Aydın Alatan

Categories: Computer Vision

2022-09-19

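Video frame interpolation (VFI) is a fundamental vision task that aims to synthesize several frames between two consecutive original video images. Most algorithms aim to accomplish VFI by using only keyframes, which is an ill-posed problem since the keyframes usually do not yield accurate information about the trajectories of the objects in the scene. Event-based cameras, on the other hand, provide more precise information between the keyframes of a video. Some recent state-of-the-art event-based methods approach this problem by using event data for better optical flow estimation and then interpolating video frames by warping. Nonetheless, these methods suffer heavily from the ghosting effect. Meanwhile, some kernel-based VFI methods that use only frames as input have shown that deformable convolutions, when backed up with transformers, can be a reliable way of dealing with long-range dependencies. We propose event-based video frame interpolation with attention (E-VFIA) as a lightweight kernel-based method. E-VFIA fuses event information with standard video frames by deformable convolutions to generate high-quality interpolated frames. The proposed method represents events with high temporal resolution and uses a multi-head self-attention mechanism to better encode event-based information, while being less vulnerable to blurring and ghosting artifacts, thus generating crisper frames. The simulation results show that the proposed technique outperforms current state-of-the-art methods (both frame-based and event-based) with a significantly smaller model size.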

Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud

Geraldo F. Oliveira , Juan Gómez-Luna , Saugata Ghose , Amirali Boroumand , Onur Mutlu

Categories: Machine Learning

2022-09-19

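Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or by memory resources. The processing-in-memory (PIM) paradigm, where computation is placed near or within memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures vary in form, and different PIM approaches lead to different trade-offs. Our goal is to analyze DRAM-based PIM architectures in terms of NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip; (2) Mensa, a 3D-stack-based PIM architecture tailored for edge devices; and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis reveals that PIM greatly benefits memory-bound NNs: (1) UPMEM provides 23x the performance of a high-end GPU when the GPU requires memory oversubscription for a general matrix-vector multiplication kernel; (2) Mensa improves energy efficiency and throughput by 3.0x and 3.1x, respectively, over the Google Edge TPU for 24 Google edge NN models; and (3) SIMDRAM outperforms a CPU/GPU by 16.7x/1.4x for three binary NNs. We conclude that, due to inherent architectural design choices, the ideal PIM architecture for NN models depends on a model's distinct attributes.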
